Active preference learning for ordering items in- and out-of-sample
Herman Bergström (Chalmers University of Technology and University of Gothenburg, hermanb@chalmers.se), Emil Carlsson
Learning an ordering of items based on pairwise comparisons is useful when items are difficult to rate consistently on an absolute scale, for example, when annotators have to make subjective assessments. When exhaustive comparison is infeasible, actively sampling item pairs can reduce the number of annotations necessary for learning an accurate ordering. However, many algorithms ignore shared structure between items, limiting their sample efficiency and precluding generalization to new items. It is also common to disregard how noise in comparisons varies between item pairs, despite it being informative of item similarity. In this work, we study active preference learning for ordering items with contextual attributes, both in- and out-of-sample. We give an upper bound on the expected ordering error of a logistic preference model as a function of which items have been compared. Next, we propose an active learning strategy that samples items to minimize this bound by accounting for aleatoric and epistemic uncertainty in comparisons. We evaluate the resulting algorithm, and a variant aimed at reducing model misspecification, in multiple realistic ordering tasks with comparisons made by human annotators. Our results demonstrate superior sample efficiency and generalization compared to non-contextual ranking approaches and active preference learning baselines.
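To make the abstract's ingredients concrete, the following is a minimal sketch of a logistic preference model on item features, together with a simple pair-selection heuristic that multiplies an aleatoric term by an epistemic term. It is an illustration only and not the paper's algorithm or bound: the scoring rule, the diagonal parameter variance `var_diag`, and all function names are assumptions made for this example.

```python
import math
from itertools import combinations

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def preference_prob(theta, x_i, x_j):
    """P(item i is preferred over item j) under a logistic model on feature differences."""
    z = [a - b for a, b in zip(x_i, x_j)]
    return sigmoid(sum(t * d for t, d in zip(theta, z)))

def pair_score(theta, var_diag, x_i, x_j):
    """Hypothetical acquisition score (an assumption, not the paper's criterion)."""
    z = [a - b for a, b in zip(x_i, x_j)]
    p = sigmoid(sum(t * d for t, d in zip(theta, z)))
    aleatoric = p * (1.0 - p)  # comparison noise: largest when p is near 0.5
    # Parameter uncertainty along z, assuming a diagonal covariance for theta.
    epistemic = math.sqrt(sum(v * d * d for v, d in zip(var_diag, z)))
    return aleatoric * epistemic

def select_pair(theta, var_diag, items):
    """Return the index pair (i, j) with the highest score."""
    return max(
        combinations(range(len(items)), 2),
        key=lambda ij: pair_score(theta, var_diag, items[ij[0]], items[ij[1]]),
    )

# Toy data: three items described by two contextual attributes.
items = [[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]]
theta = [2.0, 0.0]     # current parameter estimate (assumed)
var_diag = [1.0, 1.0]  # assumed per-parameter variance
best = select_pair(theta, var_diag, items)
print(best)
```

Because the model scores items through their attributes, the same `theta` can rank items that were never compared, which is the out-of-sample setting the abstract refers to.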
Detecting and Mitigating Treatment Leakage in Text-Based Causal Inference: Distillation and Sensitivity Analysis
Adel Daoud, Richard Johansson, Connor T. Jerzak
Text-based causal inference increasingly employs textual data as proxies for unobserved confounders, yet this approach introduces a previously undertheorized source of bias: treatment leakage. Treatment leakage occurs when text intended to capture confounding information also contains signals predictive of treatment status, thereby inducing post-treatment bias in causal estimates. Critically, this problem can arise even when documents precede treatment assignment, as authors may employ future-referencing language that anticipates subsequent interventions. Despite growing recognition of this issue, no systematic methods exist for identifying and mitigating treatment leakage in text-as-confounder applications. This paper addresses this gap through three contributions. First, we provide formal statistical and set-theoretic definitions of treatment leakage that clarify when and why bias occurs. Second, we propose four text distillation methods -- similarity-based passage removal, distant supervision classification, salient feature removal, and iterative nullspace projection -- designed to eliminate treatment-predictive content while preserving confounder information. Third, we validate these methods through simulations using synthetic text and an empirical application examining International Monetary Fund structural adjustment programs and child mortality. Our findings indicate that moderate distillation optimally balances bias reduction against confounder retention, whereas overly stringent approaches degrade estimate precision.
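Of the four distillation methods named above, iterative nullspace projection is the easiest to sketch in a few lines. The example below is a generic illustration under stated assumptions, not the authors' implementation: it treats each document as a small centered feature vector, fits a bias-free logistic probe that predicts treatment, projects the features onto the nullspace of the probe's weight vector, and repeats. The helper names, the toy data, and the hyperparameters are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_logistic_direction(X, t, steps=200, lr=0.5):
    """Fit a bias-free logistic probe for treatment t by gradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(X, t):
            err = 1.0 / (1.0 + math.exp(-dot(w, x))) - y
            for k, xk in enumerate(x):
                grad[k] += err * xk / len(X)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

def project_out(X, w):
    """Remove the component of every row of X that lies along w."""
    norm_sq = dot(w, w)
    return [[xk - dot(x, w) / norm_sq * wk for xk, wk in zip(x, w)] for x in X]

def inlp(X, t, rounds=2, tol=1e-8):
    """Repeatedly fit a treatment probe and project its direction out of X."""
    directions = []
    for _ in range(rounds):
        w = fit_logistic_direction(X, t)
        if dot(w, w) < tol:  # nothing treatment-predictive left to remove
            break
        X = project_out(X, w)
        directions.append(w)
    return X, directions

# Toy features: column 0 leaks treatment, column 2 stands in for a confounder.
X = [[1.0, 0.2, 0.5], [0.9, -0.1, -0.4], [-1.0, -0.2, 0.6], [-1.1, 0.1, -0.5]]
t = [1, 1, 0, 0]
X_clean, directions = inlp(X, t)
```

The number of rounds plays the role of distillation strength: each round removes one more direction, so more rounds remove more treatment signal but also risk discarding confounder information, which is the trade-off the abstract describes.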